An order optimal policy for exploiting idle spectrum in cognitive radio networks

Abstract

In this paper, a spectrum sensing policy employing recency-based exploration is proposed for cognitive radio networks. We formulate the problem of finding a spectrum sensing policy for multi-band dynamic spectrum access as a stochastic restless multi-armed bandit problem with stationary unknown reward distributions. In cognitive radio networks, the multi-armed bandit problem arises when deciding where in the radio spectrum to look for idle frequencies that could be efficiently exploited for data transmission. We consider two models for the dynamics of the frequency bands: 1) the independent model, where the state of the band evolves randomly and independently of the past, and 2) the Gilbert-Elliot model, where the states evolve according to a 2-state Markov chain. It is shown that under these conditions the proposed sensing policy attains asymptotically logarithmic weak regret. The policy proposed in this paper is an index policy, in which the index of a frequency band consists of a sample mean term and a recency-based exploration bonus term. The sample mean promotes spectrum exploitation, whereas the exploration bonus encourages further exploration for idle bands providing high data rates. The proposed recency-based approach readily allows constructing the exploration bonus such that the time interval between consecutive sensing time instants of a suboptimal band grows exponentially, which then leads to logarithmically increasing weak regret. Simulation results confirming logarithmic weak regret are presented, and it is found that the proposed policy often provides improved performance at low complexity over other state-of-the-art policies in the literature.
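
To make the structure of such an index policy concrete, the following is a minimal Python sketch. The abstract does not state the exact form of the exploration bonus, so the choice used here, sqrt(c * ln(t / tau_k)) with tau_k the last time band k was sensed, is an assumption made for illustration only, as are the function names `rbe_index` and `run_policy`, the constant `c`, and the idle probabilities. The sketch covers only the independent model (each band is idle with a fixed probability in every slot) and reports the expected weak regret relative to always sensing the single best band.

```python
import math
import random


def rbe_index(mean, t, last_sensed, c=2.0):
    """Sample mean plus a recency-based bonus.

    The bonus form sqrt(c * ln(t / last_sensed)) is an illustrative
    assumption, not a formula taken from the abstract.
    """
    return mean + math.sqrt(c * math.log(t / last_sensed))


def run_policy(idle_probs, horizon, c=2.0, seed=0):
    """Sense one band per slot under the independent band model.

    Returns the per-band sensing counts and the accumulated expected
    weak regret against always sensing the best band.
    """
    rng = random.Random(seed)
    k = len(idle_probs)
    best = max(idle_probs)
    counts = [0] * k        # how many times each band has been sensed
    means = [0.0] * k       # sample mean of observed rewards per band
    last = [0] * k          # last slot in which each band was sensed
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            band = t - 1    # initialisation: sense every band once
        else:
            band = max(
                range(k),
                key=lambda i: rbe_index(means[i], t, last[i], c),
            )
        # Reward 1 if the sensed band turns out to be idle, else 0.
        reward = 1.0 if rng.random() < idle_probs[band] else 0.0
        counts[band] += 1
        means[band] += (reward - means[band]) / counts[band]
        last[band] = t
        regret += best - idle_probs[band]

    return counts, regret


if __name__ == "__main__":
    counts, regret = run_policy([0.2, 0.5, 0.9], horizon=5000)
    print("sensing counts:", counts)
    print("expected weak regret:", regret)
```

Under this assumed bonus, a suboptimal band whose sample mean trails the best band by roughly d is sensed again only once ln(t / tau_k) exceeds about d^2 / c, so its sensing instants grow roughly geometrically. That is the exponential growth of sensing intervals that the abstract links to logarithmic weak regret.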